On Retaining Intermediate Probabilistic Models When Building Bayesian Networks

Authors

  • Prashant J. Doshi
  • Lloyd G. Greenwald
  • John R. Clarke
Abstract

The process of building a Bayesian network may occur in stages, in which intermediate Bayesian networks are built during preliminary processing and then used in the construction of further Bayesian networks. For example, in (Doshi, Greenwald, & Clarke 2001) we describe a way to use Bayesian networks to model and correct errors in noisy datasets. The corrected datasets are then used in (Doshi 2001) to build predictive Bayesian networks. Through this process we built networks that capture probabilistic relationships between 412 fields of data from 169,512 patients admitted to trauma centers in Pennsylvania and registered in the Pennsylvania Trauma Systems Foundation Trauma Registry between 1986 and 1999. In the process mentioned above, intermediate Bayesian networks were used to find the most likely values for fields found to have errors. These most likely values were then used in the cleansed dataset. However, in the subsequent process of building Bayesian networks from this dataset, we questioned whether or not these intermediate networks used in error correction should have been retained. In other words, we wanted to understand the tradeoffs involved in retaining the distributional information summarized in each error-correction network rather than just retaining the most likely value for each corrected field. This question can be generalized to any process of building a Bayesian network in stages. This note describes preliminary work to understand these issues. An important component of this staged network building process is that common variables are represented from one stage to the next. In data cleansing, variables used to query for error distributions are the same variables that are used as evidence variables in the final predictive network. Furthermore, the context variables used to model errors are also represented directly in the final network. 
Retaining distribution information can be accomplished by employing networks from early stages within the subsequent networks. Common variables limit the potential blow-up in network size.
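The tradeoff described above can be illustrated with a small sketch. It is not the authors' method, which reuses the early-stage networks inside the later ones. It is a simplified stand-in that assumes the error-correction stage yields, for each record, a posterior over one corrected field `X`, and it compares two ways of estimating a downstream conditional `P(Y | X)`: counting only the most likely value of `X`, or using fractional (expected) counts weighted by the retained posterior. The records, value names, and function names are all hypothetical.

```python
from collections import defaultdict

def hard_counts(records):
    """Count (x, y) pairs using only the most likely corrected value of x."""
    counts = defaultdict(float)
    for posterior, y in records:
        x_map = max(posterior, key=posterior.get)  # keep the argmax, discard the rest
        counts[(x_map, y)] += 1.0
    return dict(counts)

def soft_counts(records):
    """Count (x, y) pairs fractionally, weighted by the retained posterior over x."""
    counts = defaultdict(float)
    for posterior, y in records:
        for x, p in posterior.items():
            counts[(x, y)] += p
    return dict(counts)

def conditional(counts, x, y, y_values):
    """Estimate P(y | x) from (possibly fractional) counts."""
    total = sum(counts.get((x, v), 0.0) for v in y_values)
    return counts.get((x, y), 0.0) / total if total else float("nan")

# Hypothetical data: (posterior over corrected field X, observed value of Y).
records = [
    ({"low": 0.6, "high": 0.4}, "yes"),
    ({"low": 0.6, "high": 0.4}, "no"),
    ({"low": 0.1, "high": 0.9}, "no"),
    ({"low": 0.9, "high": 0.1}, "yes"),
]
Y_VALUES = ["yes", "no"]

hard = hard_counts(records)
soft = soft_counts(records)

for name, counts in (("most likely value", hard), ("retained posterior", soft)):
    for x in ("low", "high"):
        print(name, x, round(conditional(counts, x, "yes", Y_VALUES), 3))
```

In this toy data, the most-likely-value estimate assigns `P(yes | high)` a probability of zero, because no record with `Y = yes` has `high` as its single most likely value. The fractional counts keep the probability mass that the posteriors place on `high`, at the cost of carrying a distribution per corrected field rather than a single value.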


Similar resources

Rule-based joint fuzzy and probabilistic networks

One of the important challenges in Graphical models is the problem of dealing with the uncertainties in the problem. Among graphical networks, fuzzy cognitive map is only capable of modeling fuzzy uncertainty and the Bayesian network is only capable of modeling probabilistic uncertainty. In many real issues, we are faced with both fuzzy and probabilistic uncertainties. In these cases, the propo...


Load-Frequency Control: a GA based Bayesian Networks Multi-agent System

Bayesian Networks (BN) provides a robust probabilistic method of reasoning under uncertainty. They have been successfully applied in a variety of real-world tasks but they have received little attention in the area of load-frequency control (LFC). In practice, LFC systems use proportional-integral controllers. However since these controllers are designed using a linear model, the nonlinearities...


A Hybrid Bayesian Network Modeling Environment

Bayesian networks are a powerful method for building probability models. But the formalism does not support incremental model development and reuse of models. This is partly due to the fact that Bayesian networks require precise probability values, while incremental model development and model reuse require the ability to abstract probability information. We present a formalism called hybrid Ba...


Probabilistic Contaminant Source Identification in Water Distribution Infrastructure Systems

Large water distribution systems can be highly vulnerable to penetration of contaminant factors caused by different means including deliberate contamination injections. As contaminants quickly spread into a water distribution network, rapid characterization of the pollution source has a high measure of importance for early warning assessment and disaster management. In this paper, a methodology...


From Probabilistic Horn Logic to Chain Logic

Probabilistic logics have attracted a great deal of attention during the past few years. Where logical languages have, already from the inception of the field of artificial intelligence, taken a central position in research on knowledge representation and automated reasoning, probabilistic graphical models with their associated probabilistic basis have taken up in recent years a similar positio...




Publication date: 2001